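Endoscopy is the most widely used medical technique for detecting cancer and polyps inside hollow organs. However, images acquired by an endoscope are frequently affected by illumination artefacts caused by the orientation of the light source. When the pose of the endoscope's light source changes suddenly, two main problems arise: overexposed and underexposed tissue regions. Both can lead to misdiagnosis because of the lack of information in the affected regions, or can hamper the performance of the computer vision methods used during the non-invasive examination (e.g., SLAM, structure from motion, optical flow). The aim of this work is two-fold: i) to introduce a new synthetic dataset generated with a generative adversarial technique, and ii) to explore both shallow and deep-learning-based image enhancement methods under overexposed and underexposed lighting conditions. The best quantitative (i.e., metric-based) results were obtained by the deep-learning-based LMSPEC method, which also ran at around 7.6 fps.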
translated by 谷歌翻译
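In this contribution, we use an ensemble deep-learning method to combine the predictions of two individual one-stage detectors (i.e., YOLOv4 and Yolact) with the aim of detecting artefacts in endoscopic images. This ensemble strategy enables us to improve the robustness of the individual models without harming their real-time computation capabilities. We demonstrate the effectiveness of our approach by training and testing the two individual models and various ensemble configurations on the "Endoscopic Artifact Detection Challenge" dataset. Extensive experiments show the superiority, in terms of mean average precision, of the ensemble approach over the individual models and previous works.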
We improve the understanding of the $\textit{golden ratio algorithm}$, which solves monotone variational inequalities (VI) and convex-concave min-max problems via the distinctive feature of adapting the step sizes to the local Lipschitz constants. Adaptive step sizes not only eliminate the need to pick hyperparameters, but they also remove the necessity of global Lipschitz continuity and can increase from one iteration to the next. We first establish the equivalence of this algorithm with popular VI methods such as reflected gradient, Popov or optimistic gradient descent-ascent in the unconstrained case with constant step sizes. We then move on to the constrained setting and introduce a new analysis that allows the use of larger step sizes, completing the bridge between the golden ratio algorithm and the existing algorithms in the literature. In doing so, we actually eliminate the link between the golden ratio $\frac{1+\sqrt{5}}{2}$ and the algorithm. Moreover, we improve the adaptive version of the algorithm, first by removing the maximum step size hyperparameter (an artifact from the analysis) to improve the complexity bound, and second by adjusting it to nonmonotone problems with weak Minty solutions, with superior empirical performance.
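For readers unfamiliar with the method, the following is a minimal sketch of the commonly cited fixed-step form of the golden ratio iteration, applied to the toy saddle-point problem min_x max_y xy. It is not the adaptive variant or the new analysis described in the abstract. The function names, the step size of 0.5, and the test problem are assumptions chosen for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def golden_ratio_algorithm(F, project, z0, step, iters=1000):
    """Fixed-step golden ratio iteration for a monotone VI (illustrative sketch).

    F       -- operator of the variational inequality (list -> list)
    project -- projection onto the constraint set (identity if unconstrained)
    step    -- constant step size; assumed to satisfy step <= PHI / (2 * L),
               where L is a Lipschitz constant of F
    """
    z = list(z0)
    z_bar = list(z0)
    for _ in range(iters):
        # Averaging step: convex combination of the current iterate
        # and the previous averaged point.
        z_bar = [((PHI - 1) * zi + bi) / PHI for zi, bi in zip(z, z_bar)]
        # Operator step taken from the averaged point, then projected.
        g = F(z)
        z = project([bi - step * gi for bi, gi in zip(z_bar, g)])
    return z


def bilinear_operator(z):
    """VI operator of min_x max_y x*y, i.e. (df/dx, -df/dy); here L = 1."""
    x, y = z
    return [y, -x]


def identity(z):
    """Projection for the unconstrained case."""
    return z


solution = golden_ratio_algorithm(bilinear_operator, identity, [1.0, 1.0], step=0.5)
print(solution)  # both coordinates should be close to the saddle point (0, 0)
```

On this bilinear problem plain gradient descent-ascent diverges, so it is a convenient sanity check for VI methods of this family.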
A challenge in spoken language translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we fine-tune a general-purpose, large language model to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. We compare to several segmentation strategies and find that our approach improves BLEU score on three languages by an average of 2.7 BLEU overall compared to an automatic punctuation baseline. Further, we demonstrate the effectiveness of two constrained decoding strategies to improve well-formedness of the model output from above 99% to 100%.
Self-supervised learning (SSL) learns useful representations from unlabelled data by training networks to be invariant to pairs of augmented versions of the same input. Non-contrastive methods avoid collapse either by directly regularizing the covariance matrix of network outputs or through asymmetric loss architectures, two seemingly unrelated approaches. Here, by building on DirectPred, we lay out a theoretical framework that reconciles these two views. We derive analytical expressions for the representational learning dynamics in linear networks. By expressing them in the eigenspace of the embedding covariance matrix, where the solutions decouple, we reveal the mechanism and conditions that provide implicit variance regularization. These insights allow us to formulate a new isotropic loss function that equalizes eigenvalue contribution and renders learning more robust. Finally, we show empirically that our findings translate to nonlinear networks trained on CIFAR-10 and STL-10.
Mechanistic cardiac electrophysiology models allow for personalized simulations of the electrical activity in the heart and the ensuing electrocardiogram (ECG) on the body surface. As such, synthetic signals possess known ground truth labels of the underlying disease and can be employed for validation of machine learning ECG analysis tools in addition to clinical signals. Recently, synthetic ECGs were used to enrich sparse clinical data or even replace them completely during training, leading to improved performance on real-world clinical test data. We thus generated a novel synthetic database comprising a total of 16,900 12-lead ECGs based on electrophysiological simulations, equally distributed into healthy control and 7 pathology classes. The pathological case of myocardial infarction had 6 sub-classes. A comparison of extracted features between the virtual cohort and a publicly available clinical ECG database demonstrated that the synthetic signals represent clinical ECGs for healthy and pathological subpopulations with high fidelity. The ECG database is split into training, validation, and test folds for development and objective assessment of novel machine learning algorithms.
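Equivariance of linear neural network layers is well studied. In this work, we relax the equivariance condition so that it only holds in a projective sense. In particular, we study the relation between projective and ordinary equivariance and show that, for important examples, the two problems are in fact equivalent. The rotation group in 3D acts projectively on the projective plane. We experimentally study the practical importance of rotation equivariance when designing networks for filtering 2D-2D correspondences. Fully equivariant models perform poorly, and while simply adding invariant features to a strong baseline yields improvements, this does not seem to be due to improved equivariance.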
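Successful material selection is critical in designing and manufacturing products for design automation. Designers leverage their knowledge and experience to create high-quality designs by selecting the most appropriate materials through performance, manufacturability, and sustainability evaluation. Intelligent tools can help designers with varying expertise by providing recommendations learned from prior designs. To enable this, we introduce a graph representation learning framework that supports the material prediction of bodies in assemblies. We formulate the material selection task as a node-level prediction task over the assembly graph representation of CAD models and tackle it using Graph Neural Networks (GNNs). Evaluations over three experimental protocols performed on the Fusion 360 Gallery dataset indicate the feasibility of our approach, achieving a top-3 micro-F1 score of 0.75. The proposed framework can scale to large datasets and incorporate designers' knowledge into the learning process. These capabilities allow the framework to serve as a recommendation system for design automation and as a baseline for future work, narrowing the gap between human designers and intelligent design agents.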
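Ventricular tachycardia (VT) can be one of the causes of cardiac death, which affects 4.25 million people worldwide. A curative treatment is catheter ablation, which inactivates the abnormally triggering regions. To facilitate and expedite localization during the ablation procedure, we present two novel localization techniques based on convolutional neural networks (CNNs). In contrast to existing methods, e.g. those using ECG imaging, our approaches were designed to be independent of patient-specific geometries and directly applicable to surface ECG signals, while also delivering a binary transmural position. One method outputs ranked alternative solutions. Results can be visualized on either a generic or a patient geometry. The CNNs were trained on a dataset containing only simulated data and evaluated on both simulated and clinical test data. On simulated data, the median test error was below 3 mm. The median localization error on the clinical data was as low as 32 mm. The transmural position was correctly detected in up to 82% of all clinical cases. Using the ranked alternative solutions, the top-3 median error dropped to 20 mm on clinical data. These results demonstrate a proof of principle for using CNNs to localize the activation source without the intrinsic need for patient-specific geometrical information. Furthermore, delivering multiple solutions can help the physician find the real activation source among several possible locations. With further optimization, these methods have high potential to speed up clinical interventions. Consequently, they could decrease procedural risk and improve outcomes for VT patients.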
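The abstract does not define how the top-3 median error is computed. One plausible reading, sketched below, takes for each case the smallest distance between the true source and the k highest-ranked predicted locations, and then the median over all cases. The function names and the toy coordinates (in mm) are hypothetical and are not data from the paper.

```python
import math
from statistics import median


def top_k_error(candidates, truth, k):
    """Smallest Euclidean distance from truth to the k highest-ranked candidates.

    candidates -- predicted 3D locations, ordered from most to least likely
    truth      -- true 3D location of the activation source
    """
    return min(math.dist(c, truth) for c in candidates[:k])


def median_top_k_error(cases, k):
    """Median of the per-case top-k errors over (candidates, truth) pairs."""
    return median(top_k_error(cands, truth, k) for cands, truth in cases)


# Hypothetical toy cases: three ranked predictions per case, truth at the origin.
origin = (0.0, 0.0, 0.0)
cases = [
    ([(30.0, 0.0, 0.0), (10.0, 0.0, 0.0), (50.0, 0.0, 0.0)], origin),
    ([(0.0, 20.0, 0.0), (0.0, 40.0, 0.0), (0.0, 5.0, 0.0)], origin),
    ([(0.0, 0.0, 40.0), (0.0, 0.0, 15.0), (0.0, 0.0, 60.0)], origin),
]

top1 = median_top_k_error(cases, k=1)
top3 = median_top_k_error(cases, k=3)
print(top1, top3)
```

Under this reading the top-3 error can never exceed the top-1 error, which is consistent with the drop from 32 mm to 20 mm reported for the clinical data.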
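Antenna arrays with beamforming overcome the high free-space path loss at higher carrier frequencies. However, the beams must be properly aligned to ensure that the highest power is radiated towards (and received by) the user equipment (UE). While there are methods that improve on an exhaustive search for the optimal beams through some form of hierarchical search, they can be prone to returning only locally optimal solutions with small beam gains. Other approaches address this problem by exploiting contextual information, such as the position of the UE or information from neighboring base stations (BS), but the burden of computing and communicating this additional information can be high. Methods based on machine learning have so far suffered from the accompanying training, performance monitoring, and deployment complexity, which hinders their application at scale. This paper proposes a novel method for solving the initial beam-discovery problem. It is scalable, and easy to tune and implement. Our algorithm is based on a recommender system that associates groups (i.e., UEs) and preferences (i.e., beams from a codebook) based on a training dataset. Whenever a new UE needs to be served, our algorithm returns the best beams in this user cluster. Our simulation results demonstrate the efficiency and robustness of our approach, not only in single-BS setups but also in setups that require coordination among several BSs. Our method consistently outperforms standard baseline algorithms in the given task.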
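To make the recommender idea concrete, here is a small neighbor-based sketch. The abstract does not say how UEs are grouped or how beams are scored, so everything below is an assumption: the new UE measures a few probe beams, its "cluster" is the set of training UEs with the most similar probe gains (cosine similarity), and the recommended beams are those with the highest mean gain in that cluster. The gain table and all names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_beams(train_gains, probe_beams, probe_gains, n_neighbors=2, n_beams=2):
    """Return codebook indices of the beams to try first for a new UE.

    train_gains -- one row per training UE: gain of every beam in the codebook
    probe_beams -- indices of the few beams the new UE has measured
    probe_gains -- the new UE's measured gains on those probe beams
    """
    # Assumed grouping rule: rank training UEs by similarity on the probe beams.
    ranked = sorted(
        train_gains,
        key=lambda row: cosine([row[b] for b in probe_beams], probe_gains),
        reverse=True,
    )
    cluster = ranked[:n_neighbors]
    # Assumed scoring rule: mean gain of each beam within the cluster.
    n_codebook = len(train_gains[0])
    mean_gain = [sum(row[b] for row in cluster) / len(cluster) for b in range(n_codebook)]
    return sorted(range(n_codebook), key=lambda b: mean_gain[b], reverse=True)[:n_beams]


# Hypothetical training data: 4 UEs, codebook of 4 beams.
train_gains = [
    [9.0, 7.0, 1.0, 0.5],
    [8.0, 6.0, 2.0, 1.0],
    [1.0, 2.0, 8.0, 9.0],
    [0.5, 1.0, 7.0, 9.5],
]

# The new UE has measured only beams 0 and 3.
recommended = recommend_beams(train_gains, probe_beams=[0, 3], probe_gains=[1.0, 8.0])
print(recommended)
```

In this toy setup the new UE resembles the last two training UEs, so the sketch proposes the beams that served those UEs best instead of sweeping the whole codebook.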